Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce mapping transformer to allow transform mappings during index create/update or index template create/update #17635

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

bzhangam
Copy link
Contributor

Description

In this PR we introduce the mapping transformer to the core to allow transforming the mapping during index create/update and index template create/update. We want to do this because we have a use case in neural search plugin where we want to auto generate the semantic fields(e.g. knn_vector or rank_feature) in the mappings based on the model id defined in the semantic field(A new field type introduced by neural search plugin). In this way we can simplify the neural search set up.

Potentially in future this feature can be leveraged by other plugins to auto transform the mapping based on the certain config defined by users.

Related Issues

Resolves #[17500] - Proposal to introduce the mapping transformer into core.

#[803] - Proposal to support semantic field in neural search.
#[1211] - HLD of the semantic field.

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Contributor

❌ Gradle check result for 7d47df7: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bzhangam bzhangam force-pushed the introduceMappingTransformer branch 2 times, most recently from 6f9c569 to dc11eee Compare March 19, 2025 20:59
Copy link
Contributor

❌ Gradle check result for dc11eee: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bzhangam bzhangam force-pushed the introduceMappingTransformer branch from dc11eee to c35c386 Compare March 20, 2025 17:11
Copy link
Contributor

❌ Gradle check result for c35c386: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bzhangam bzhangam force-pushed the introduceMappingTransformer branch from c35c386 to 2406d5b Compare March 25, 2025 18:26
@bzhangam bzhangam requested a review from peternied as a code owner March 25, 2025 18:26
@bzhangam bzhangam closed this Mar 25, 2025
@bzhangam bzhangam force-pushed the introduceMappingTransformer branch from 2406d5b to 3ac7c23 Compare March 25, 2025 18:30
@bzhangam bzhangam reopened this Mar 25, 2025
@github-project-automation github-project-automation bot moved this from ✅ Done to 🏗 In progress in Cluster Manager Project Board Mar 25, 2025
Copy link
Contributor

❌ Gradle check result for b8204e8: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@bzhangam bzhangam force-pushed the introduceMappingTransformer branch 2 times, most recently from 6016e32 to 0103403 Compare March 25, 2025 19:55
Copy link
Contributor

❌ Gradle check result for 0103403: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Copy link
Member

@saratvemulapalli saratvemulapalli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the changes look good. Thanks for taking care of this.
Dropped a few comments.

builder.map(mapping);
listener.onResponse(builder.toString());
} catch (IOException e) {
listener.onFailure(new RuntimeException("Failed to parse the transformed mapping [" + mapping + "]", e));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this is a catch all for the transformation, but ideally we should have bailed out where the root cause of the transformation is (to help with debugging), because we have already validated user mapping by this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is not a catch all. It's more for the case that after the transformation it's possible some transformers corrupt the mapping and make it not a valid format. In that case we cannot parse the mapping back to a json string. The tricky part is that we don't know which transformer corrupt the mapping if there are multiple transformers because we don't want to parse the mapping in each transformer. Do you want to suggest a better error message?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was worried there would be no way to debug later. One option is to validate the parsing after each transformation. Do you know whats the worst case performance impact of parsing for each applyNext?. Its a trade off performance vs debuggable.
Possibly we can exposing a cluster setting which operators can tune and turn it off by default.

@@ -511,7 +515,17 @@ private <Request extends ClusterManagerNodeRequest<Request>, Response extends Ac
masterNodeAction,
request,
clusterState,
new PlainActionFuture<>()
new ActionListener<Response>() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious what prompted this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PlainActionFuture will not throw the exception on failure but there is a unit test that wants to verify the exception thrown during index creation. Before my change the exception will be threw since in the TransportCreateIndexAction we don't wrap the code to create the index. But now it's wrapped by a listener.

@bzhangam bzhangam force-pushed the introduceMappingTransformer branch from 0103403 to 599b03a Compare April 10, 2025 20:30
…x create/update or index template create/update.

Signed-off-by: Bo Zhang <[email protected]>
@bzhangam bzhangam force-pushed the introduceMappingTransformer branch from 599b03a to ceabd64 Compare April 10, 2025 20:35
Copy link
Contributor

❕ Gradle check result for ceabd64: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Copy link

codecov bot commented Apr 10, 2025

Codecov Report

Attention: Patch coverage is 83.59375% with 21 lines in your changes missing coverage. Please review.

Project coverage is 72.57%. Comparing base (a4bc4bf) to head (077b13c).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...earch/index/mapper/MappingTransformerRegistry.java 80.48% 8 Missing ⚠️
...put/TransportPutComposableIndexTemplateAction.java 82.35% 1 Missing and 2 partials ⚠️
.../template/put/TransportPutIndexTemplateAction.java 85.00% 3 Missing ⚠️
...rg/opensearch/index/mapper/MappingTransformer.java 0.00% 3 Missing ⚠️
...plate/put/TransportPutComponentTemplateAction.java 87.50% 0 Missing and 2 partials ⚠️
...java/org/opensearch/cluster/metadata/Template.java 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17635      +/-   ##
============================================
+ Coverage     72.44%   72.57%   +0.13%     
- Complexity    66801    66894      +93     
============================================
  Files          5455     5457       +2     
  Lines        309183   309281      +98     
  Branches      44987    44991       +4     
============================================
+ Hits         223981   224471     +490     
+ Misses        66873    66434     -439     
- Partials      18329    18376      +47     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

❕ Gradle check result for 077b13c: UNSTABLE

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cluster Manager enhancement Enhancement or improvement to existing feature or request Plugins
Projects
Status: 🏗 In progress
Development

Successfully merging this pull request may close these issues.

[RFC] Introduce Mapping Transformer
3 participants